This dataset was scraped from nextspaceflight.com and includes all the space missions since the beginning of the Space Race between the USA and the Soviet Union in 1957!
%pip install iso3166
Requirement already satisfied: iso3166 in c:\users\dacs\anaconda3\lib\site-packages (2.0.2) Note: you may need to restart the kernel to use updated packages.
%pip install --upgrade plotly
Requirement already satisfied: plotly in c:\users\dacs\anaconda3\lib\site-packages (5.5.0) Requirement already satisfied: tenacity>=6.2.0 in c:\users\dacs\anaconda3\lib\site-packages (from plotly) (8.0.1) Requirement already satisfied: six in c:\users\dacs\anaconda3\lib\site-packages (from plotly) (1.16.0) Note: you may need to restart the kernel to use updated packages.
import numpy as np
import pandas as pd
import plotly.express as px
import matplotlib.pyplot as plt
import seaborn as sns
from iso3166 import countries
from datetime import datetime, timedelta
pd.options.display.float_format = '{:,.2f}'.format
df_data = pd.read_csv('mission_launches.csv')
df_data.shape # The shape is 4324 Rows and 9 Columns.
df_data.columns
Index(['Unnamed: 0', 'Unnamed: 0.1', 'Organisation', 'Location', 'Date',
'Detail', 'Rocket_Status', 'Price', 'Mission_Status'],
dtype='object')
df_data.isna().values.any()
True
df_data.isna().sum()
Unnamed: 0 0
Unnamed: 0.1 0
Organisation 0
Location 0
Date 0
Detail 0
Rocket_Status 0
Price 3360
Mission_Status 0
dtype: int64
As we can see, we cannot simply drop the rows with NaN values here: the price is missing in 3360 of 4324 rows (about 78%), so we will deal with that once we start exploring the price column.
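The share of missing values per column can also be computed directly instead of being estimated from the counts. The sketch below uses a small made-up frame (`toy`, not the real `df_data`) to show the idea: the mean of the boolean `isna()` mask is the fraction of missing entries.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for df_data; the values are illustrative only.
toy = pd.DataFrame({
    "Organisation": ["SpaceX", "CASC", "SpaceX", "Roscosmos"],
    "Price": [50.0, 29.75, np.nan, np.nan],
})

# isna() yields booleans; their column-wise mean is the fraction missing.
missing_share = toy.isna().mean() * 100
print(missing_share)
```

Applied to `df_data`, the same expression should report roughly 77.7% for `Price`, since 3360 of 4324 values are missing.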
df_data.duplicated().values.any()
False
df_missions = df_data.drop(columns=['Unnamed: 0', 'Unnamed: 0.1'])
df_missions.head()
| | Organisation | Location | Date | Detail | Rocket_Status | Price | Mission_Status |
|---|---|---|---|---|---|---|---|
| 0 | SpaceX | LC-39A, Kennedy Space Center, Florida, USA | Fri Aug 07, 2020 05:12 UTC | Falcon 9 Block 5 \| Starlink V1 L9 & BlackSky | StatusActive | 50.0 | Success |
| 1 | CASC | Site 9401 (SLS-2), Jiuquan Satellite Launch Ce... | Thu Aug 06, 2020 04:01 UTC | Long March 2D \| Gaofen-9 04 & Q-SAT | StatusActive | 29.75 | Success |
| 2 | SpaceX | Pad A, Boca Chica, Texas, USA | Tue Aug 04, 2020 23:57 UTC | Starship Prototype \| 150 Meter Hop | StatusActive | NaN | Success |
| 3 | Roscosmos | Site 200/39, Baikonur Cosmodrome, Kazakhstan | Thu Jul 30, 2020 21:25 UTC | Proton-M/Briz-M \| Ekspress-80 & Ekspress-103 | StatusActive | 65.0 | Success |
| 4 | ULA | SLC-41, Cape Canaveral AFS, Florida, USA | Thu Jul 30, 2020 11:50 UTC | Atlas V 541 \| Perseverance | StatusActive | 145.0 | Success |
df_missions.describe()
| | Organisation | Location | Date | Detail | Rocket_Status | Price | Mission_Status |
|---|---|---|---|---|---|---|---|
| count | 4324 | 4324 | 4324 | 4324 | 4324 | 964 | 4324 |
| unique | 56 | 137 | 4319 | 4278 | 2 | 56 | 4 |
| top | RVSN USSR | Site 31/6, Baikonur Cosmodrome, Kazakhstan | Wed Nov 05, 2008 00:15 UTC | Cosmos-3MRB (65MRB) \| BOR-5 Shuttle | StatusRetired | 450.0 | Success |
| freq | 1777 | 235 | 2 | 6 | 3534 | 136 | 3879 |
df_organisations = df_missions.groupby('Mission_Status', as_index=False).agg({'Organisation': pd.Series.count})
df_organisations.sort_values('Organisation', ascending=False, inplace=True)
df_organisations
| | Mission_Status | Organisation |
|---|---|---|
| 3 | Success | 3879 |
| 0 | Failure | 339 |
| 1 | Partial Failure | 102 |
| 2 | Prelaunch Failure | 4 |
df_nr_launches = df_missions.groupby('Organisation', as_index=False).agg({'Mission_Status': pd.Series.count})
df_nr_launches.head()
| | Organisation | Mission_Status |
|---|---|---|
| 0 | AEB | 3 |
| 1 | AMBA | 8 |
| 2 | ASI | 9 |
| 3 | Arianespace | 279 |
| 4 | Armée de l'Air | 4 |
fig = px.bar(x=df_nr_launches.Organisation,
y=df_nr_launches.Mission_Status,
color=df_nr_launches.Mission_Status,
title='Number of Launches per Company')
fig.update_layout(xaxis_title='Organisation',
coloraxis_showscale=False,
yaxis_title='Number of Launches')
fig.show()
df_rockets = df_missions.groupby('Rocket_Status', as_index=False).count()
df_rockets.value_counts()
Rocket_Status Organisation Location Date Detail Price Mission_Status
StatusActive 790 790 790 790 586 790 1
StatusRetired 3534 3534 3534 3534 378 3534 1
dtype: int64
Checking how many missions were successful and how many failed.
df_mission_status = df_missions.groupby('Mission_Status', as_index=False).count()
df_mission_status.value_counts()
Mission_Status Organisation Location Date Detail Rocket_Status Price
Failure 339 339 339 339 339 36 1
Partial Failure 102 102 102 102 102 17 1
Prelaunch Failure 4 4 4 4 4 1 1
Success 3879 3879 3879 3879 3879 910 1
dtype: int64
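Counting the categories of a single column does not need a `groupby`: `Series.value_counts()` returns the counts directly, and `normalize=True` turns them into shares. A minimal sketch on made-up data (the `toy` frame is an assumption for illustration, not the real dataset):

```python
import pandas as pd

# Toy stand-in for df_missions; the values are illustrative only.
toy = pd.DataFrame({
    "Mission_Status": ["Success", "Success", "Failure",
                       "Success", "Partial Failure"],
})

# Absolute number of missions per status.
status_counts = toy["Mission_Status"].value_counts()

# Same breakdown expressed as a percentage of all missions.
status_share = toy["Mission_Status"].value_counts(normalize=True) * 100

print(status_counts)
print(status_share)
```

The same call on `df_missions['Rocket_Status']` would give the active versus retired split in one step.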
The Price column is given in millions of USD, so the missing values (NaN) have to be handled before charting it.
df_price_of_launches = df_data.dropna()
df_price_of_launches.Price.isna().values.any()
False
df_price_of_launches_sorted = df_price_of_launches.sort_values('Price', ascending=True)
hist = px.histogram(df_price_of_launches_sorted, x='Price',
nbins=40,
opacity=0.6)
hist.update_layout(xaxis_title='Price of Launch',
yaxis_title='Count')
hist.show()
The country names had to be cleaned up to get a consistent naming structure for the map (using the iso3166 package).
df_country_launches = df_data.groupby(['Location'], as_index=False).agg({'Mission_Status': pd.Series.count})
df_country_launches['Location'] = df_country_launches.Location.str.split(',').str[-1]
df_country_launches.head()
| | Location | Mission_Status |
|---|---|---|
| 0 | USA | 12 |
| 1 | France | 4 |
| 2 | USA | 1 |
| 3 | USA | 6 |
| 4 | France | 15 |
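Splitting on `','` alone leaves a leading space in front of the country name (`' USA'`), which is easy to miss in a rendered table. One way to avoid that, sketched here on two location strings taken from the data, is to split once from the right and strip the whitespace:

```python
import pandas as pd

locations = pd.Series([
    "LC-39A, Kennedy Space Center, Florida, USA",
    "Site 200/39, Baikonur Cosmodrome, Kazakhstan",
])

# rsplit with n=1 splits only at the last comma; strip() removes
# the space that follows the comma.
country = locations.str.rsplit(",", n=1).str[-1].str.strip()
print(country.tolist())
```

Splitting on `', '` also works as long as every comma in the column is followed by exactly one space.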
df_missions['launch_country'] = df_missions["Location"].str.split(", ").str[-1]
df_missions['launch_country'] = df_missions['launch_country'].replace({
    'Gran Canaria': 'USA', 'Yellow Sea': 'China',
    'Pacific Missile Range Facility': 'USA', 'Barents Sea': 'Russian Federation',
    'Russia': 'Russian Federation', 'Pacific Ocean': 'USA',
    'Marshall Islands': 'USA', 'Iran': 'Iran, Islamic Republic of',
    'North Korea': "Korea, Democratic People's Republic of", 'South Korea': "Korea, Republic of",
    'Shahrud Missile Test Site': "Iran, Islamic Republic of", 'New Mexico': "USA"
})  # assign the result back: a chained inplace replace may not update df_missions in recent pandas
df_missions['launch_country_code'] = df_missions['launch_country'].apply(lambda x: (countries.get(x).alpha3))
df_launches = df_missions.groupby(['launch_country', 'launch_country_code'], as_index=False).agg({'Mission_Status': pd.Series.count})
df_launches.rename(columns={'Mission_Status': 'Total_launches'}, inplace=True)
fig = px.choropleth(data_frame=df_launches,
locations='launch_country_code',
color='Total_launches',
color_continuous_scale='matter')
fig.update_layout(coloraxis_showscale=True,)
fig.show()
df_country_fails = df_missions[df_missions.Mission_Status != 'Success'] # keep only the launches that were not a success
df_country_fails = df_country_fails.groupby(['launch_country', 'launch_country_code'], as_index=False).agg({'Mission_Status': pd.Series.count})
df_country_fails.rename(columns={'Mission_Status': 'Total_launches'}, inplace=True)
df_country_fails.head()
| | launch_country | launch_country_code | Total_launches |
|---|---|---|---|
| 0 | Australia | AUS | 3 |
| 1 | Brazil | BRA | 3 |
| 2 | China | CHN | 25 |
| 3 | France | FRA | 18 |
| 4 | India | IND | 13 |
fig2 = px.choropleth(df_country_fails, locations='launch_country_code',
color='Total_launches', # colour each country by its number of failed launches
hover_name='launch_country', # country name shown as the hover title
color_continuous_scale=px.colors.sequential.matter)
fig2.update_layout(coloraxis_showscale=True,)
fig2.show()
country_org_status = df_missions.groupby(['launch_country_code', 'Mission_Status', 'Organisation'], as_index=False).agg({'Mission_Status': pd.Series.count})
country_org_status
| | launch_country_code | Organisation | Mission_Status |
|---|---|---|---|
| 0 | AUS | CECLES | 2 |
| 1 | AUS | RAE | 1 |
| 2 | AUS | AMBA | 1 |
| 3 | AUS | CECLES | 1 |
| 4 | AUS | RAE | 1 |
| ... | ... | ... | ... |
| 127 | USA | Sea Launch | 33 |
| 128 | USA | SpaceX | 94 |
| 129 | USA | ULA | 139 |
| 130 | USA | US Air Force | 129 |
| 131 | USA | US Navy | 2 |
132 rows × 3 columns
burst = px.sunburst(country_org_status,
path=['launch_country_code', 'Organisation', 'Mission_Status'],
values='Mission_Status',
title='Countries, organisations, and mission status.',
)
burst.update_layout(xaxis_title='Number of Missions',
yaxis_title='Organisation',
coloraxis_showscale=False)
burst.show()
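In the table above the `Mission_Status` column holds the count rather than the status label, so the innermost ring of the sunburst shows numbers. A possible alternative, sketched on a made-up `toy` frame, keeps the status as a grouping key and stores the count in a separate column (the name `Launches` is an assumption chosen for this example):

```python
import pandas as pd

# Toy stand-in for df_missions; the values are illustrative only.
toy = pd.DataFrame({
    "launch_country_code": ["USA", "USA", "USA", "KAZ"],
    "Organisation": ["SpaceX", "SpaceX", "ULA", "Roscosmos"],
    "Mission_Status": ["Success", "Failure", "Success", "Success"],
})

# size() counts rows per group; reset_index(name=...) puts the count
# into its own column so it cannot clash with the grouping keys.
status_counts = (
    toy.groupby(["launch_country_code", "Organisation", "Mission_Status"])
       .size()
       .reset_index(name="Launches")
)
print(status_counts)
```

A frame shaped like this can be passed to `px.sunburst` with `path=['launch_country_code', 'Organisation', 'Mission_Status']` and `values='Launches'`.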
Price without NaN values: non-numeric entries are coerced to NaN and all NaN values are then replaced with 0.
df_missions.Price = pd.to_numeric(df_missions.Price, errors = 'coerce') #'coerce' means that invalid parsing will be set as NaN.
df_missions.Price = df_missions.Price.fillna(0)
df_missions.Price.sum()
123175.68000000002
df_missions.Price.describe()
count 4,324.00
mean 28.49
std 85.93
min 0.00
25% 0.00
50% 0.00
75% 0.00
max 450.00
Name: Price, dtype: float64
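With `errors='coerce'`, any price that `pd.to_numeric` cannot parse silently becomes NaN and is then counted as 0. If some prices are stored as text with a thousands separator, they would be lost this way. The sketch below assumes such values exist (the string `'1,160.0'` is a hypothetical example, not a value confirmed from the data) and removes the separator before converting:

```python
import pandas as pd

# Hypothetical raw prices as strings; None marks a missing price.
raw_price = pd.Series(["50.0", "1,160.0", None, "29.75"])

# Direct conversion: the value with a comma cannot be parsed and becomes NaN.
naive = pd.to_numeric(raw_price, errors="coerce")

# Removing the thousands separator first keeps that value.
cleaned = pd.to_numeric(
    raw_price.str.replace(",", "", regex=False), errors="coerce"
)

print(naive.tolist())
print(cleaned.tolist())
```

Comparing `naive.isna().sum()` with `cleaned.isna().sum()` on the real column is a quick way to check whether any prices are affected.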
org_price = df_missions.groupby(['Price', 'Organisation'], as_index=False).agg({'Price': pd.Series.sum})
org_price.sort_values('Price', ascending=False, inplace=True)
org_price
| | Organisation | Price |
|---|---|---|
| 122 | NASA | 61,200.00 |
| 119 | Arianespace | 15,000.00 |
| 101 | ULA | 3,706.00 |
| 121 | ULA | 3,500.00 |
| 98 | MHI | 2,520.00 |
| ... | ... | ... |
| 45 | UT | 0.00 |
| 44 | US Navy | 0.00 |
| 43 | US Air Force | 0.00 |
| 42 | ULA | 0.00 |
| 0 | AEB | 0.00 |
123 rows × 2 columns
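Because `Price` is one of the grouping keys above, each organisation appears once per distinct price (ULA shows up several times). To get one total per organisation, a possible variant groups by `Organisation` only. The sketch uses a made-up `toy` frame with illustrative prices:

```python
import pandas as pd

# Toy stand-in for df_missions; the prices are illustrative only.
toy = pd.DataFrame({
    "Organisation": ["NASA", "NASA", "ULA", "ULA", "AEB"],
    "Price": [450.0, 450.0, 145.0, 109.0, 0.0],
})

# One row per organisation, with the summed price, largest first.
org_total = (
    toy.groupby("Organisation", as_index=False)["Price"].sum()
       .sort_values("Price", ascending=False)
)
print(org_total)
```

The bar chart then gets a single bar per organisation instead of several stacked segments.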
fig3 = px.bar(x=org_price.Organisation,
y=org_price.Price,
title='Money Spent by Organisation')
fig3.update_layout(xaxis_title='Organisation',
coloraxis_showscale=False,
yaxis_title='Money Spent')
fig3.show()
df_missions.Date = pd.to_datetime(df_missions.Date, utc=True)
df_missions_launch = df_missions.groupby(df_missions.Date.dt.year).count()
plt.figure(figsize=(16,8), dpi=200)
plt.title('Number of Launches per Year', fontsize=18)
plt.scatter(x=df_missions_launch.Date.index,
y=df_missions_launch.Date.values,
color='dodgerblue',
alpha=0.7,
s=100)
plt.show()
df_missions['year'] = pd.DatetimeIndex(df_missions['Date']).year
df_missions['month'] = pd.DatetimeIndex(df_missions['Date']).month
df_missions.head()
| | Organisation | Location | Date | Detail | Rocket_Status | Price | Mission_Status | launch_country | launch_country_code | year | month |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | SpaceX | LC-39A, Kennedy Space Center, Florida, USA | 2020-08-07 05:12:00+00:00 | Falcon 9 Block 5 \| Starlink V1 L9 & BlackSky | StatusActive | 50.00 | Success | USA | USA | 2020 | 8 |
| 1 | CASC | Site 9401 (SLS-2), Jiuquan Satellite Launch Ce... | 2020-08-06 04:01:00+00:00 | Long March 2D \| Gaofen-9 04 & Q-SAT | StatusActive | 29.75 | Success | China | CHN | 2020 | 8 |
| 2 | SpaceX | Pad A, Boca Chica, Texas, USA | 2020-08-04 23:57:00+00:00 | Starship Prototype \| 150 Meter Hop | StatusActive | 0.00 | Success | USA | USA | 2020 | 8 |
| 3 | Roscosmos | Site 200/39, Baikonur Cosmodrome, Kazakhstan | 2020-07-30 21:25:00+00:00 | Proton-M/Briz-M \| Ekspress-80 & Ekspress-103 | StatusActive | 65.00 | Success | Kazakhstan | KAZ | 2020 | 7 |
| 4 | ULA | SLC-41, Cape Canaveral AFS, Florida, USA | 2020-07-30 11:50:00+00:00 | Atlas V 541 \| Perseverance | StatusActive | 145.00 | Success | USA | USA | 2020 | 7 |
cols=["year","month"]
df_missions['year-month'] = df_missions[cols].apply(lambda x: '-'.join(x.values.astype(str)), axis="columns")
df_missions.head()
| | Organisation | Location | Date | Detail | Rocket_Status | Price | Mission_Status | launch_country | launch_country_code | year | month | year-month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | SpaceX | LC-39A, Kennedy Space Center, Florida, USA | 2020-08-07 05:12:00+00:00 | Falcon 9 Block 5 \| Starlink V1 L9 & BlackSky | StatusActive | 50.00 | Success | USA | USA | 2020 | 8 | 2020-8 |
| 1 | CASC | Site 9401 (SLS-2), Jiuquan Satellite Launch Ce... | 2020-08-06 04:01:00+00:00 | Long March 2D \| Gaofen-9 04 & Q-SAT | StatusActive | 29.75 | Success | China | CHN | 2020 | 8 | 2020-8 |
| 2 | SpaceX | Pad A, Boca Chica, Texas, USA | 2020-08-04 23:57:00+00:00 | Starship Prototype \| 150 Meter Hop | StatusActive | 0.00 | Success | USA | USA | 2020 | 8 | 2020-8 |
| 3 | Roscosmos | Site 200/39, Baikonur Cosmodrome, Kazakhstan | 2020-07-30 21:25:00+00:00 | Proton-M/Briz-M \| Ekspress-80 & Ekspress-103 | StatusActive | 65.00 | Success | Kazakhstan | KAZ | 2020 | 7 | 2020-7 |
| 4 | ULA | SLC-41, Cape Canaveral AFS, Florida, USA | 2020-07-30 11:50:00+00:00 | Atlas V 541 \| Perseverance | StatusActive | 145.00 | Success | USA | USA | 2020 | 7 | 2020-7 |
df_nr_launches_month = df_missions.groupby('year-month', as_index=False).agg({'Mission_Status': pd.Series.count})
df_nr_launches_month
| | year-month | Mission_Status |
|---|---|---|
| 0 | 1957-10 | 1 |
| 1 | 1957-11 | 1 |
| 2 | 1957-12 | 1 |
| 3 | 1958-10 | 3 |
| 4 | 1958-11 | 1 |
| ... | ... | ... |
| 742 | 2020-4 | 5 |
| 743 | 2020-5 | 9 |
| 744 | 2020-6 | 7 |
| 745 | 2020-7 | 14 |
| 746 | 2020-8 | 3 |
747 rows × 2 columns
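The `year-month` labels above are not zero-padded, so they sort as text: `1958-10` comes before any single-digit month of 1958. One way to get labels that sort chronologically, sketched here on three illustrative timestamps, is to format the datetime column directly:

```python
import pandas as pd

# Illustrative timestamps standing in for df_missions.Date.
dates = pd.to_datetime(
    pd.Series(["2020-08-07 05:12", "1958-10-11 08:42", "1958-02-01 03:48"]),
    utc=True,
)

# %m is always two digits, so the strings sort in calendar order.
year_month = dates.dt.strftime("%Y-%m")

print(year_month.tolist())
print(sorted(year_month))
```

Grouping on such a column keeps the bar chart's x axis in calendar order without any extra sorting step.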
fig3 = px.bar(x=df_nr_launches_month['year-month'],
y=df_nr_launches_month.Mission_Status,
title='Number of Launches by month')
fig3.update_layout(xaxis_title='Month',
coloraxis_showscale=False,
yaxis_title='Number of Launches')
fig3.show()
df_nr_launches_month.sort_values(by='Mission_Status', ascending=False)
| | year-month | Mission_Status |
|---|---|---|
| 156 | 1971-12 | 18 |
| 212 | 1975-9 | 16 |
| 236 | 1977-9 | 16 |
| 730 | 2019-12 | 16 |
| 123 | 1968-4 | 16 |
| ... | ... | ... |
| 550 | 2004-10 | 1 |
| 272 | 1980-9 | 1 |
| 271 | 1980-8 | 1 |
| 62 | 1963-3 | 1 |
| 0 | 1957-10 | 1 |
747 rows × 2 columns
df_missions['month'].value_counts()
12 450
6 402
4 383
10 381
8 373
9 365
3 353
7 351
2 336
11 336
5 326
1 268
Name: month, dtype: int64
launch_price = df_missions.groupby(['year'], as_index=False).agg({'Price': pd.Series.sum})
fig2 = px.scatter(x=launch_price.year,
y=launch_price.Price,
title='Launch Price Over Time')
fig2.update_layout(xaxis_title='Year',
coloraxis_showscale=False,
yaxis_title='Launch Price')
fig2.show()
nr_launches_org = df_missions.groupby(['year', 'Organisation'], as_index=False).agg({'Mission_Status': pd.Series.count})
nr_launches_org
| | year | Organisation | Mission_Status |
|---|---|---|---|
| 0 | 1957 | RVSN USSR | 2 |
| 1 | 1957 | US Navy | 1 |
| 2 | 1958 | AMBA | 7 |
| 3 | 1958 | NASA | 2 |
| 4 | 1958 | RVSN USSR | 5 |
| ... | ... | ... | ... |
| 658 | 2020 | Roscosmos | 4 |
| 659 | 2020 | SpaceX | 14 |
| 660 | 2020 | ULA | 4 |
| 661 | 2020 | VKS RF | 3 |
| 662 | 2020 | Virgin Orbit | 1 |
663 rows × 3 columns
fig2 = px.scatter(x=nr_launches_org.year,
y=nr_launches_org.Organisation,
title='Number of Launches over Time by Organisation',
color=nr_launches_org.Mission_Status)
fig2.update_layout(xaxis_title='Year',
coloraxis_showscale=False,
yaxis_title='Organisation')
fig2.show()
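The chart above plots every organisation. To restrict it to the most active ones, a possible approach is to pick the top organisations by total launches first and filter on them before grouping. The sketch uses a made-up `toy` frame and `top_n = 2` so the effect is visible; with the real data `top_n` would be 10:

```python
import pandas as pd

# Toy stand-in for df_missions; the values are illustrative only.
toy = pd.DataFrame({
    "year": [2019, 2019, 2020, 2019, 2020, 2020],
    "Organisation": ["SpaceX", "SpaceX", "SpaceX", "ULA", "ULA", "Virgin Orbit"],
})

top_n = 2  # would be 10 for the real dataset

# Organisations ranked by their total number of launches.
top_orgs = toy["Organisation"].value_counts().head(top_n).index

# Keep only launches by those organisations, then count per year.
per_year = (
    toy[toy["Organisation"].isin(top_orgs)]
       .groupby(["year", "Organisation"])
       .size()
       .reset_index(name="Launches")
)
print(per_year)
```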
The Cold War ran from before the start of the dataset until 1991; the filter below keeps launches from years before 1991.
cold_war = df_missions.where(df_missions.year < 1991)
cold_war = cold_war.loc[(cold_war['launch_country_code'] == 'USA') | (cold_war['launch_country_code'] == 'RUS') | (cold_war['launch_country_code'] == 'KAZ')]
cold_war.head()
| | Organisation | Location | Date | Detail | Rocket_Status | Price | Mission_Status | launch_country | launch_country_code | year | month | year-month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1776 | RVSN USSR | Site 32/2, Plesetsk Cosmodrome, Russia | 1990-12-22 07:28:00+00:00 | Tsyklon-3 \| Cosmos 2114 to 2119 | StatusRetired | 0.00 | Success | Russian Federation | RUS | 1,990.00 | 12.00 | 1990-12 |
| 1777 | RVSN USSR | Site 133/3, Plesetsk Cosmodrome, Russia | 1990-12-10 07:54:00+00:00 | Cosmos-3M (11K65M) \| Cosmos 2112 | StatusRetired | 0.00 | Success | Russian Federation | RUS | 1,990.00 | 12.00 | 1990-12 |
| 1778 | RVSN USSR | Site 90/20, Baikonur Cosmodrome, Kazakhstan | 1990-12-04 00:48:00+00:00 | Tsyklon-2 \| Cosmos 2107 | StatusRetired | 0.00 | Success | Kazakhstan | KAZ | 1,990.00 | 12.00 | 1990-12 |
| 1779 | NASA | LC-39B, Kennedy Space Center, Florida, USA | 1990-12-02 06:49:00+00:00 | Space Shuttle Columbia \| STS-35 | StatusRetired | 450.00 | Success | USA | USA | 1,990.00 | 12.00 | 1990-12 |
| 1780 | General Dynamics | SLC-3W, Vandenberg AFB, California, USA | 1990-12-01 15:57:00+00:00 | Atlas-E/F Star-37S-ISS \| DMSP F-10 | StatusRetired | 0.00 | Success | USA | USA | 1,990.00 | 12.00 | 1990-12 |
Kazakhstan is included because it was part of the USSR during the Cold War.
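The `year` and `month` columns show up as floats (`1,990.00`, `12.00`) in the table above because `DataFrame.where` keeps every row and fills the non-matching ones with NaN, which forces integer columns to float. Boolean indexing drops the rows instead and keeps the dtypes. A small sketch on made-up data:

```python
import pandas as pd

# Toy stand-in for df_missions; the values are illustrative only.
toy = pd.DataFrame({
    "year": [1989, 1990, 1995],
    "launch_country_code": ["RUS", "USA", "USA"],
})

# where(): same number of rows, non-matching rows become NaN.
masked = toy.where(toy["year"] < 1991)

# Boolean indexing: non-matching rows are removed, dtypes unchanged.
filtered = toy[toy["year"] < 1991]

# isin() is a compact way to express the three-country condition.
filtered = filtered[filtered["launch_country_code"].isin(["USA", "RUS", "KAZ"])]

print(masked.dtypes)
print(filtered.dtypes)
```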
cold_war_launches = cold_war.groupby(['launch_country_code'], as_index=False).agg({'Mission_Status': pd.Series.count})
cold_war_launches
| | launch_country_code | Mission_Status |
|---|---|---|
| 0 | KAZ | 578 |
| 1 | RUS | 1163 |
| 2 | USA | 644 |
fig = px.pie(cold_war_launches,
values=cold_war_launches.Mission_Status,
names=cold_war_launches.launch_country_code,
title='Total Number of Launches from the Cold War',
hole=0.3)
fig.update_traces(textposition='inside',
textfont_size=15,
textinfo='percent')
fig.show()
cold_war_yearonyear = cold_war.groupby(['year', 'launch_country_code'], as_index=False).agg({'Mission_Status': pd.Series.count})
cold_war_yearonyear
| | year | launch_country_code | Mission_Status |
|---|---|---|---|
| 0 | 1,957.00 | KAZ | 2 |
| 1 | 1,957.00 | USA | 1 |
| 2 | 1,958.00 | KAZ | 5 |
| 3 | 1,958.00 | USA | 23 |
| 4 | 1,959.00 | KAZ | 4 |
| ... | ... | ... | ... |
| 93 | 1,989.00 | RUS | 22 |
| 94 | 1,989.00 | USA | 16 |
| 95 | 1,990.00 | KAZ | 7 |
| 96 | 1,990.00 | RUS | 30 |
| 97 | 1,990.00 | USA | 26 |
98 rows × 3 columns
fig3 = px.bar(x=cold_war_yearonyear['year'],
y=cold_war_yearonyear.Mission_Status,
color=cold_war_yearonyear.launch_country_code,
barmode = 'group',
title='Number of Launches by year')
fig3.update_layout(xaxis_title='Year',
coloraxis_showscale=False,
yaxis_title='Number of Launches')
fig3.show()
mission_fails = cold_war.loc[(cold_war['Mission_Status'] == 'Failure')]
mission_fails.head()
| | Organisation | Location | Date | Detail | Rocket_Status | Price | Mission_Status | launch_country | launch_country_code | year | month | year-month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1795 | RVSN USSR | Site 45/2, Baikonur Cosmodrome, Kazakhstan | 1990-10-04 04:27:00+00:00 | Zenit-2 \| Tselina-2 n°8 | StatusRetired | 0.00 | Failure | Kazakhstan | KAZ | 1,990.00 | 10.00 | 1990-10 |
| 1837 | Martin Marietta | SLC-40, Cape Canaveral AFS, Florida, USA | 1990-03-14 11:52:00+00:00 | Commercial Titan III \| Intelsat 603 | StatusRetired | 136.60 | Failure | USA | USA | 1,990.00 | 3.00 | 1990-3 |
| 1885 | RVSN USSR | Site 32/2, Plesetsk Cosmodrome, Russia | 1989-06-09 10:10:00+00:00 | Tsyklon-3 \| Okean 2a | StatusRetired | 0.00 | Failure | Russian Federation | RUS | 1,989.00 | 6.00 | 1989-6 |
| 2005 | General Dynamics | SLC-36B, Cape Canaveral AFS, Florida, USA | 1987-03-26 21:22:00+00:00 | Atlas-G Centaur-D1AR \| FLTSATCOM-6 | StatusRetired | 0.00 | Failure | USA | USA | 1,987.00 | 3.00 | 1987-3 |
| 2040 | RVSN USSR | Site 32/2, Plesetsk Cosmodrome, Russia | 1986-10-15 05:24:00+00:00 | Tsyklon-3 \| Cosmos 1786 to 1791 | StatusRetired | 0.00 | Failure | Russian Federation | RUS | 1,986.00 | 10.00 | 1986-10 |
hist2 = px.histogram(mission_fails, x='year',
nbins=40,
opacity=0.6)
hist2.update_layout(xaxis_title='Year',
yaxis_title='Number of Failed Launches')
hist2.show()
cold_war['Mission_Status'].value_counts(normalize=True) * 100
Success 88.09
Failure 9.14
Partial Failure 2.73
Prelaunch Failure 0.04
Name: Mission_Status, dtype: float64
launches_leaders_year = df_missions.groupby(['launch_country_code', 'year'], as_index=False).agg({'Mission_Status': pd.Series.count})
launches_leaders_year
| | launch_country_code | year | Mission_Status |
|---|---|---|---|
| 0 | AUS | 1967 | 1 |
| 1 | AUS | 1968 | 1 |
| 2 | AUS | 1969 | 1 |
| 3 | AUS | 1970 | 2 |
| 4 | AUS | 1971 | 1 |
| ... | ... | ... | ... |
| 407 | USA | 2016 | 27 |
| 408 | USA | 2017 | 30 |
| 409 | USA | 2018 | 34 |
| 410 | USA | 2019 | 27 |
| 411 | USA | 2020 | 21 |
412 rows × 3 columns
country_leaders_year = df_missions.groupby(['year', 'launch_country_code'])['Mission_Status'].count().reset_index().sort_values(['year', 'Mission_Status'], ascending=False)
country_leaders_year = pd.concat([group[1].head(1) for group in country_leaders_year.groupby(['year'])])
country_leaders_year.columns = ['year', 'launch_country_code', 'nr_launches']
fig5 = px.bar(
country_leaders_year,
x="year",
y="nr_launches",
color='launch_country_code',
title='Leaders by launches for every year (countries)')
fig5.update_layout(xaxis_title='Year',
coloraxis_showscale=False,
yaxis_title='Number of Launches')
fig5.show()
org_leaders_year = df_missions.groupby(['year', 'Organisation'])['Mission_Status'].count().reset_index().sort_values(['year', 'Mission_Status'], ascending=False)
org_leaders_year = pd.concat([group[1].head(1) for group in org_leaders_year.groupby(['year'])])
org_leaders_year.columns = ['year', 'organisation', 'nr_launches']
fig6 = px.bar(
org_leaders_year,
x="year",
y="nr_launches",
color='organisation',
title='Leaders by launches for every year (organisation)')
fig6.show()
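The sort-then-`head(1)` pattern used for both leader tables can also be written with `idxmax`, which returns the row label of the largest value in each group. The sketch below is an alternative on a made-up `counts` frame (the launch numbers are illustrative only):

```python
import pandas as pd

# Illustrative per-year launch counts; not taken from the real data.
counts = pd.DataFrame({
    "year": [2019, 2019, 2020, 2020],
    "Organisation": ["CASC", "SpaceX", "CASC", "SpaceX"],
    "nr_launches": [27, 13, 19, 14],
})

# idxmax gives, per year, the index label of the row with most launches;
# loc then selects exactly those rows.
leaders = counts.loc[counts.groupby("year")["nr_launches"].idxmax()]
print(leaders)
```

When two organisations tie in a year, `idxmax` keeps the first one it encounters, so the result depends on row order in that case.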